The Computational-Linguistic Approach to Forensic Authorship Attribution

Author

  • Carole E. Chaski
Abstract

This article examines the diversity of methods in authorship attribution through a lens that focuses attention on a single common element. The study of authorship attribution is currently spread across so many academic and non-academic disciplines that it is nearly impossible to describe all of the various assumptions about language and authorship. The disciplines involved in authorship attribution range over Classics, Biblical exegesis, Paleography, Communication and Rhetoric, English literary criticism, Handwriting examination, General Linguistics, Sociolinguistics, Computational Linguistics, Statistics, and Machine Learning. Given the breadth of this list, it is no wonder that the current state of authorship attribution appears to be a jumbled mass of multiple contrasts.

Similar resources

Best Practices and Admissibility of Forensic Author Identification

Forensic linguistics provides answers to four categories of inquiry in investigative and legal settings: (i) identification of author, language, or speaker; (ii) intertextuality, or the relationship between texts; (iii) text-typing or classification of text types such as threats, suicide notes, or predatory chat; and (iv) linguistic profiling to assess the author’s dialect, native language, age...

Explaining Delta, or: How do distance measures for authorship attribution work?

Authorship Attribution is a research area in quantitative text analysis concerned with attributing texts of unknown or disputed authorship to their actual author based on quantitatively measured linguistic evidence (see Juola 2006; Stamatatos 2009; Koppel et al. 2009). Authorship attribution has applications in literary studies, history, forensics and many other fields, e.g. corpus stylistics (...

The effect of author set size and data size in authorship attribution

Applications of authorship attribution ‘in the wild’ [Koppel, M., Schler, J., and Argamon, S. (2010). Authorship attribution in the wild. Language Resources and Evaluation. Advanced Access published January 12, 2010: 10.1007/s10579-009-9111-2], for instance in social networks, will likely involve large sets of candidate authors and only limited data per author. In this article, we present the r...

A Fuzzy Logic Approach to Computer Software Source Code Authorship Analysis

Software source code authorship analysis has become an important area in recent years with promising applications in both the legal sector (such as proof of ownership and software forensics) and the education sector (such as plagiarism detection and assessing style). Authorship analysis encompasses the sub-areas of author discrimination, author characterization, and similarity detection (also r...

Authorship Identification in Large Email Collections: Experiments Using Features that Belong to Different Linguistic Levels - Notebook for PAN at CLEF 2011

The aim of this paper is to explore the usefulness of features from different linguistic levels for email authorship identification. Using various email datasets provided by the PAN’11 lab, we tested several feature groups in both authorship attribution and authorship verification subtasks. The selected feature groups combined with Regularized Logistic Regression and One-Class SVM machine learni...


Publication date: 2006